P2P hardening: fix bucket refresh, pool contention, server RW timeouts, ignore filter & NodeList slice bug #145
Merged
mateeullahmalik merged 2 commits into master on Aug 31, 2025
j-rafique approved these changes on Aug 31, 2025
Summary
In testnet, StoreArtefacts occasionally hung and supernodes gradually "lost" the network (failing to discover/reconnect). This PR addresses several correctness and resiliency issues in the P2P/Kademlia layer that could cause handler leaks, global stalls during handshakes, incorrect LRU maintenance, futile re-dials to ignored/self nodes, and malformed peer lists.
Changes
- Add SetReadDeadline/SetWriteDeadline (~30s) in handleConn.
  File: supernode/p2p/kademlia/network.go
- Avoid holding connPoolMtx across NewSecureClientConn (TLS/TCP handshake), using a double-check pattern.
  File: supernode/p2p/kademlia/network.go
- Call refreshNode with node.ID (raw), and no-op if not found to preserve LRU.
  Files: supernode/p2p/kademlia/dht.go, supernode/p2p/kademlia/hashtable.go
- Key ignoredMap by string(ID) and look up with the same representation; also filter includeNode.
  File: supernode/p2p/kademlia/hashtable.go
- Change NodeIDs()/NodeIPs() to allocate with capacity and append (no double-length / empty entries).
  File: supernode/p2p/kademlia/node.go
- File: supernode/p2p/kademlia/network.go
Why this helps
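As a rough illustration of why releasing connPoolMtx during the handshake matters, here is a minimal sketch of the double-check pattern. The pool, conn, and dial names are hypothetical stand-ins and not the repository's code; dial only plays the role of NewSecureClientConn, and the 50 ms sleep is an arbitrary placeholder for handshake latency.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// conn is a hypothetical stand-in for a secured client connection.
type conn struct{ addr string }

// pool is a hypothetical connection pool; only the locking pattern matters.
type pool struct {
	mtx   sync.Mutex
	conns map[string]*conn
}

// dial simulates a slow TLS/TCP handshake (assumed stand-in for NewSecureClientConn).
func dial(addr string) *conn {
	time.Sleep(50 * time.Millisecond)
	return &conn{addr: addr}
}

// get returns a pooled connection, performing the handshake outside the lock.
func (p *pool) get(addr string) *conn {
	// First check: fast path under the lock.
	p.mtx.Lock()
	if c, ok := p.conns[addr]; ok {
		p.mtx.Unlock()
		return c
	}
	p.mtx.Unlock()

	// Slow handshake runs without the mutex, so other peers are not blocked.
	c := dial(addr)

	// Second check: another goroutine may have stored a connection meanwhile.
	p.mtx.Lock()
	defer p.mtx.Unlock()
	if existing, ok := p.conns[addr]; ok {
		// Real code would close the redundant connection c here.
		return existing
	}
	p.conns[addr] = c
	return c
}

func main() {
	p := &pool{conns: make(map[string]*conn)}
	var wg sync.WaitGroup
	for _, a := range []string{"a:1", "b:1", "c:1", "d:1"} {
		wg.Add(1)
		go func(addr string) {
			defer wg.Done()
			p.get(addr)
		}(a)
	}
	wg.Wait()
	fmt.Println("pooled connections:", len(p.conns))
}
```

With the lock held across dial, the four handshakes above would run one after another; with this pattern they overlap, and a slow or unreachable peer delays only its own caller. The trade-off is that two goroutines may occasionally dial the same address, which the second check resolves by keeping one connection.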
StoreArtefacts
StoreArtefacts/finds keep flowing).
Risk / Mitigations
Test Plan
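One regression check that would fit the NodeIDs()/NodeIPs() fix is sketched below. The Node and NodeList types are simplified, hypothetical stand-ins rather than the repository's definitions, and buggyNodeIDs is an assumed reconstruction of the pitfall implied by "double-length / empty entries": allocating a slice with a length and then appending to it.

```go
package main

import "fmt"

// Node and NodeList are simplified, hypothetical stand-ins for the real types.
type Node struct {
	ID []byte
	IP string
}

type NodeList struct {
	Nodes []*Node
}

// buggyNodeIDs allocates with a length and then appends, so the result
// starts with len(Nodes) nil entries followed by the real IDs.
func (l *NodeList) buggyNodeIDs() [][]byte {
	ids := make([][]byte, len(l.Nodes))
	for _, n := range l.Nodes {
		ids = append(ids, n.ID)
	}
	return ids
}

// NodeIDs allocates with zero length and the needed capacity, then appends.
func (l *NodeList) NodeIDs() [][]byte {
	ids := make([][]byte, 0, len(l.Nodes))
	for _, n := range l.Nodes {
		ids = append(ids, n.ID)
	}
	return ids
}

func main() {
	l := &NodeList{Nodes: []*Node{
		{ID: []byte{1}, IP: "10.0.0.1"},
		{ID: []byte{2}, IP: "10.0.0.2"},
	}}
	fmt.Println(len(l.buggyNodeIDs()), len(l.NodeIDs())) // prints "4 2"
}
```

A check along these lines asserts that the returned slice has exactly one entry per node and contains no nil entries.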